Estimating Car Insurance Premia: a Case Study in High-Dimensional Data Inference
Estimating insurance premia from data is a difficult regression problem for several reasons, including the large number of variables, many of which are discrete. We compare several machine learning methods for estimating insurance premia and test them on a large database of car insurance policies. We find that function approximation methods that do not optimize a squared loss, such as Support Vector Machine regression, do not work well in this context. The compared methods include decision trees and generalized linear models. The best results are obtained with a mixture of experts, which better identifies the least and most risky contracts and makes it possible to reduce the median premium by charging more to the most risky customers.
Nicolas Chapados, Yoshua Bengio, Pascal Vincent, Joumana Ghosn, Charles Dugas, Ichiro Takeuchi, Linyan Meng
This conditional expected claim amount is called the pure premium, and it is the basis of the gross premium charged to the insured. This expected value is conditioned on information available about the insured and about the contract, which we call the input profile here. This regression problem is difficult for several reasons: a large number of examples, a large number of variables (most of which are discrete and multi-valued), non-stationarity of the distribution, and a conditional distribution of the dependent variable that is very different from those usually encountered in typical applications.
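The following minimal sketch illustrates why the pure premium, being an expected value, is naturally tied to a squared loss. It uses synthetic claims for a single hypothetical input profile, not the paper's data: the 5% claim frequency, the Pareto severity, and all function names are assumptions chosen only for illustration. The constant that minimizes the mean squared loss is the sample mean, whereas the constant that minimizes the mean absolute loss is the sample median, which is zero when most contracts have no claim.

```python
import random
import statistics

random.seed(0)

# Synthetic claims for one hypothetical input profile (an assumption, not
# the paper's data): about 95% of contracts have no claim, and the rest
# draw a heavy-tailed positive amount.
claims = [
    random.paretovariate(1.5) * 1000.0 if random.random() < 0.05 else 0.0
    for _ in range(10_000)
]


def mean_squared_loss(prediction, targets):
    """Average squared error of a constant prediction."""
    return sum((y - prediction) ** 2 for y in targets) / len(targets)


def mean_absolute_loss(prediction, targets):
    """Average absolute error of a constant prediction."""
    return sum(abs(y - prediction) for y in targets) / len(targets)


# The sample mean is the empirical estimate of the pure premium.
mean_claim = statistics.fmean(claims)
# The sample median is the best constant under absolute loss.
median_claim = statistics.median(claims)

print("mean claim (pure premium estimate):", mean_claim)
print("median claim:", median_claim)
print("squared loss at mean vs median:",
      mean_squared_loss(mean_claim, claims),
      mean_squared_loss(median_claim, claims))
print("absolute loss at mean vs median:",
      mean_absolute_loss(mean_claim, claims),
      mean_absolute_loss(median_claim, claims))
```

Under these assumptions, a premium set at the median would charge nothing, while the mean reflects the rare large claims. This is one plausible reading of why methods that do not optimize a squared loss fare poorly here; it is an illustration, not a reproduction of the authors' experiments.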